Incidental supervision from language has become a popular approach for learning generic visual representations that can be prompted to perform many recognition tasks in computer vision. We conduct an in-depth exploration of the CLIP model and show that its visual representation is often strongly biased towards solving some tasks more than others. Moreover, which task the representation will be biased towards is unpredictable, with little consistency across images. To resolve this task bias, we show how to learn a visual prompt that guides the representation towards features relevant to the task of interest. Our results show that these visual prompts can be independent of the input image and still effectively provide a conditioning mechanism to steer visual representations towards the desired task.
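All physical laws are described as relationships between state variables that give a complete and non-redundant description of the relevant system dynamics. However, despite the prevalence of computing power and AI, the process of identifying the hidden state variables themselves has resisted automation. Most data-driven methods for modeling physical phenomena still assume that observed data streams already correspond to relevant state variables. A key challenge is to identify the possible sets of state variables from scratch, given only high-dimensional observational data. Here we propose a new principle for determining how many state variables an observed system is likely to have, and what these variables might be, directly from video streams. We demonstrate the effectiveness of this approach using video recordings of a variety of physical dynamical systems, ranging from elastic double pendulums to flames. Without any prior knowledge of the underlying physics, our algorithm discovers the intrinsic dimension of the observed dynamics and identifies candidate sets of state variables. We suggest that this approach could help catalyze the understanding, prediction, and control of increasingly complex systems. The project website is at: https://www.cs.columbia.edu/~bchen/nebural-tate-variables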
In recent years, both the quality of the medical images used to facilitate diagnosis and the performance of machine learning techniques on tasks such as classification, detection, and segmentation have improved significantly. As a result, the use of such systems in the healthcare industry has grown rapidly, for instance in the form of medical image classification systems, where these models have achieved diagnostic parity with human physicians. One such application is the classification of skin lesions in dermatoscopic images. However, as stakeholders in the healthcare industry, such as insurance companies, continue to invest extensively in machine learning infrastructure, it becomes increasingly important to understand the vulnerabilities of such systems. Because the tasks carried out by these machine learning models are highly critical, it is necessary to analyze both the techniques that could exploit these vulnerabilities and the methods to defend against them. This paper explores common adversarial attack techniques. The Fast Gradient Sign Method and Projected Gradient Descent are used against a Convolutional Neural Network trained to classify dermatoscopic images of skin lesions. It then discusses one of the most popular adversarial defense techniques, adversarial training. The performance of a model trained on adversarial examples is tested against the previously mentioned attacks, and recommendations for improving neural network robustness are provided based on the results of the experiment.
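As a rough illustration of the two attacks named in this abstract, the sketch below applies the Fast Gradient Sign Method and Projected Gradient Descent to a tiny logistic-regression classifier rather than the paper's convolutional network. The model, the weights, the step sizes, and every function name here are assumptions made for the example; inputs are assumed to lie in [0, 1] and the perturbation is bounded in the L-infinity norm.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def input_gradient(x, y, w, b):
    # Gradient of the binary cross-entropy loss with respect to the input x
    # for a logistic model p = sigmoid(w.x + b): dL/dx = (p - y) * w.
    p = sigmoid(x @ w + b)
    return (p - y) * w

def fgsm(x, y, w, b, eps):
    # Single step of size eps in the direction of the sign of the gradient.
    x_adv = x + eps * np.sign(input_gradient(x, y, w, b))
    return np.clip(x_adv, 0.0, 1.0)  # keep a valid pixel range (assumed [0, 1])

def pgd(x, y, w, b, eps, step, iters):
    # Repeated small signed-gradient steps, each followed by projection
    # back onto the L-infinity ball of radius eps around the clean input.
    x_adv = x.copy()
    for _ in range(iters):
        x_adv = x_adv + step * np.sign(input_gradient(x_adv, y, w, b))
        x_adv = np.clip(x_adv, x - eps, x + eps)
        x_adv = np.clip(x_adv, 0.0, 1.0)
    return x_adv

# Hypothetical toy model and input, chosen only for illustration.
w = np.array([2.0, -3.0, 1.0])
b = 0.0
x = np.array([0.5, 0.5, 0.5])
y = 1.0

x_fgsm = fgsm(x, y, w, b, eps=0.1)
x_pgd = pgd(x, y, w, b, eps=0.1, step=0.05, iters=5)
print(sigmoid(x @ w + b), sigmoid(x_fgsm @ w + b), sigmoid(x_pgd @ w + b))
```

Adversarial training, in this simplified picture, would mean generating such perturbed inputs during training and fitting the model on them together with their true labels.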
The findable, accessible, interoperable, and reusable (FAIR) data principles have provided a framework for examining, evaluating, and improving how we share data with the aim of facilitating scientific discovery. Efforts have been made to generalize these principles to research software and other digital products. Artificial intelligence (AI) models -- algorithms that have been trained on data rather than explicitly programmed -- are an important target for this because of the ever-increasing pace with which AI is transforming scientific and engineering domains. In this paper, we propose a practical definition of FAIR principles for AI models and create a FAIR AI project template that promotes adherence to these principles. We demonstrate how to implement these principles using a concrete example from experimental high energy physics: a graph neural network for identifying Higgs bosons decaying to bottom quarks. We study the robustness of these FAIR AI models and their portability across hardware architectures and software frameworks, and report new insights on the interpretability of AI predictions by studying the interplay between FAIR datasets and AI models. Enabled by publishing FAIR AI models, these studies pave the way toward reliable and automated AI-driven scientific discovery.
We discuss a platform that has both software and hardware components, and whose purpose is to support research into characterizing and mitigating the sim-to-real gap in robotics and vehicle autonomy engineering. The software is operating-system independent and has three main components: a simulation engine called Chrono, which supports high-fidelity vehicle and sensor simulation; an autonomy stack for algorithm design and testing; and a development environment that supports visualization and hardware-in-the-loop experimentation. The accompanying hardware platform is a 1/6th scale vehicle augmented with reconfigurable mountings for computing, sensing, and tracking. Since this vehicle platform has a digital twin within the simulation environment, one can test the same autonomy perception, state estimation, or controls algorithms, as well as the processors they run on, in both simulation and reality. A demonstration is provided to show the utilization of this platform for autonomy research. Future work will concentrate on augmenting ART/ATK with support for a full-sized Chevy Bolt EUV, which will be made available to this group in the immediate future.
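Cohort Intelligence (CI) is a novel optimization algorithm. Since its inception, it has been applied successfully in a variety of domains within a short span of time, and its results have been observed to be effective compared with algorithms of its kind. To date, no bibliometric analysis has been carried out on CI and its related applications. This research paper is therefore intended as an icebreaker for those who want to take CI to a new level. In this paper, the CI publications available in Scopus are analyzed through graphs and network diagrams of authors, source titles, keywords over the years, and journals over time. In this way, the bibliometric paper showcases CI and its applications and provides a detailed systematic review in terms of bibliometric details.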
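We propose the Generalized Probabilistic U-Net, which extends the Probabilistic U-Net by allowing more general forms of the Gaussian distribution as the latent space distribution, which can better approximate the uncertainty in the reference segmentations. We study the effect that the choice of latent space distribution has on capturing the uncertainty in the reference segmentations using the LIDC-IDRI dataset. We show that the choice of distribution affects the sample diversity of the predictions and their overlap with respect to the reference segmentations. For the LIDC-IDRI dataset, we show that using a mixture of Gaussians results in a statistically significant improvement in the generalized energy distance (GED) metric with respect to the standard Probabilistic U-Net. We have made our implementation available at https://github.com/ishaanb92/generalizedprobabilisticunet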
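The GED metric mentioned in this abstract is commonly computed as the squared generalized energy distance, 2·E[d(S, Y)] − E[d(S, S′)] − E[d(Y, Y′)], with d = 1 − IoU between binary masks. The sketch below follows that common definition; it is not taken from the authors' repository, and the function names and the convention that two empty masks have distance 0 are assumptions.

```python
import numpy as np

def iou_distance(a, b):
    # d(a, b) = 1 - IoU for two binary masks of equal shape.
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0  # assumed convention: two empty masks are identical
    return 1.0 - np.logical_and(a, b).sum() / union

def mean_pairwise_distance(xs, ys):
    # Average distance over all pairs drawn from the two sets of masks.
    return float(np.mean([iou_distance(x, y) for x in xs for y in ys]))

def ged_squared(samples, references):
    # Squared generalized energy distance between model samples
    # and reference segmentations; lower is better.
    cross = mean_pairwise_distance(samples, references)
    within_samples = mean_pairwise_distance(samples, samples)
    within_refs = mean_pairwise_distance(references, references)
    return 2.0 * cross - within_samples - within_refs

# Two hypothetical 2x2 masks used only to exercise the functions.
mask_a = np.array([[1, 0], [0, 0]])
mask_b = np.array([[1, 1], [0, 0]])
print(ged_squared([mask_a], [mask_b]))
print(ged_squared([mask_a, mask_b], [mask_a, mask_b]))
```

The within-set terms reward diversity: a model that always emits the same mask is penalized when the reference annotations disagree with one another.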
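Deep learning techniques have been successful at detecting objects in medical images, but they still suffer from false-positive predictions that may hinder accurate diagnosis. The estimated uncertainty of the neural network output has been used to flag incorrect predictions. We study the role played by features computed from neural network uncertainty estimates and by shape-based features computed from binary predictions in reducing false positives in liver lesion detection, by developing a classification-based post-processing step for different uncertainty estimation methods. We demonstrate an improvement in the lesion detection performance of the neural network (with respect to the F1-score) for all uncertainty estimation methods on two datasets, comprising abdominal MR and CT images respectively. We show that features computed from neural network uncertainty estimates tend not to contribute much toward reducing false positives. Our results show that factors such as class imbalance (the ratio of true to false positives) and shape-based features extracted from uncertainty maps play an important role in distinguishing false-positive from true-positive predictions.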
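We describe a software framework and a hardware platform used in tandem for the design and analysis of robot autonomy algorithms in simulation and reality. The software, which is open source, containerized, and operating system (OS) independent, has three main components: a ROS 2 interface to a C++ vehicle simulation framework (Chrono), which provides high-fidelity wheeled/tracked vehicle and sensor simulation; a basic ROS 2-based autonomy stack for algorithm design and testing; and a development ecosystem that enables visualization and hardware-in-the-loop experimentation in perception, state estimation, path planning, and controls. The accompanying hardware platform is a 1/6th scale vehicle augmented with reconfigurable mountings for computing, sensing, and tracking. Its purpose is to allow algorithms and sensor configurations to be physically tested and improved. Since this vehicle platform has a digital twin within the simulation environment, one can test and compare the same algorithms and autonomy stack in simulation and reality. This platform has been built to characterize and manage the simulation-to-reality gap. Herein, we describe how this platform is set up, deployed, and used to improve autonomy for mobile applications.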
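The travelling salesperson problem (TSP) is a classic resource allocation problem used to find an optimal order for completing a set of tasks while minimizing (or maximizing) an associated objective function. It is widely used in robotics for applications such as planning and scheduling. In this work, we solve TSP for two objectives using reinforcement learning (RL). In multi-objective optimization problems, the associated objective functions are often conflicting in nature. In such cases, optimality is defined in terms of Pareto optimality. The set of these Pareto optimal solutions in the objective space forms a Pareto front (or frontier). Each solution has its own trade-off. We present the Pareto frontier approximation network (PA-Net), a network that generates good approximations of the Pareto front for the bi-objective travelling salesperson problem (BTSP). First, BTSP is converted into a constrained optimization problem. We then train our network to solve this constrained problem using Lagrangian relaxation and policy gradient. With PA-Net, we improve on the performance of an existing RL-based method. The average improvement in the hypervolume metric, which is used to measure the optimality of the Pareto front, is 2.3%. At the same time, PA-Net has a faster inference time. Finally, we present an application of PA-Net to finding the optimal visiting order in a robotic navigation task/coverage planning. Our code is available on the project website.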
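To make the Pareto front and the hypervolume metric from this abstract concrete, here is a minimal sketch for two objectives that are both minimized (for example, two tour costs). It is not PA-Net and does not reproduce the paper's evaluation; the function names, the sample points, and the reference point are assumptions for illustration.

```python
def pareto_front(points):
    # Keep the non-dominated points when both objectives are minimized.
    # After sorting by the first objective, a point is non-dominated
    # exactly when it strictly improves the best second objective so far.
    front = []
    best_f2 = float("inf")
    for f1, f2 in sorted(set(points)):
        if f2 < best_f2:
            front.append((f1, f2))
            best_f2 = f2
    return front

def hypervolume_2d(points, ref):
    # Area dominated by the front and bounded by the reference point,
    # accumulated as horizontal slabs from the top of the box downward.
    inside = [p for p in points if p[0] < ref[0] and p[1] < ref[1]]
    hv = 0.0
    prev_f2 = ref[1]
    for f1, f2 in pareto_front(inside):
        hv += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return hv

# Hypothetical objective values of four candidate tours.
candidates = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (3.0, 3.0)]
reference = (4.0, 4.0)  # assumed worst-case bound on both objectives

print(pareto_front(candidates))
print(hypervolume_2d(candidates, reference))
```

A larger hypervolume means the front pushes further toward low values of both objectives, which is why it serves as a single-number summary when comparing approximations of the front.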